local adaptation
Task-unaware Lifelong Robot Learning with Retrieval-based Weighted Local Adaptation
Yang, Pengzhi, Wang, Xinyu, Zhang, Ruipeng, Wang, Cong, Oliehoek, Frans, Kober, Jens
Real-world environments require robots to continuously acquire new skills while retaining previously learned abilities, without relying on clearly defined task boundaries. Storing all past data to prevent forgetting is impractical due to storage and privacy concerns. To address this, we propose a method that efficiently restores a robot's proficiency in previously learned tasks over its lifespan. Using an Episodic Memory (EM), our approach enables experience replay during training and retrieval during testing for local fine-tuning, allowing rapid adaptation to previously encountered problems. Additionally, we introduce a selective weighting mechanism that emphasizes the most challenging segments of retrieved demonstrations, focusing local adaptation where it is most needed. This framework offers a scalable solution for lifelong learning without explicit task identifiers or implicit task boundaries, combining retrieval-based adaptation with selective weighting to enhance robot performance in open-ended scenarios. To emulate human learning patterns, the method consists of three phases: learning, reviewing, and testing. In the learning phase, the robot is exposed to various demonstrations and stores a subset of this data as episodic memory M. Balancing stability and plasticity in this way is crucial as models face sequences of tasks over time.
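The retrieval and weighting steps described above can be sketched in minimal form: pull the demonstrations nearest to the current observation out of the episodic memory, then weight each segment of a retrieved demonstration by a softmax over its per-segment loss so the hardest segments dominate the local fine-tuning objective. The function names (`retrieve`, `segment_weights`), the cosine-similarity retrieval metric, and the softmax weighting are illustrative assumptions, not the paper's exact formulation.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(memory, query, k=2):
    """Return the k stored demonstrations whose embeddings are closest
    to the query observation (no task identifier needed)."""
    return sorted(memory, key=lambda d: -cosine(d["embedding"], query))[:k]

def segment_weights(segment_losses, temperature=1.0):
    """Softmax over per-segment losses: the most challenging segments of
    a retrieved demonstration get the largest weights during adaptation."""
    exps = [math.exp(l / temperature) for l in segment_losses]
    z = sum(exps)
    return [e / z for e in exps]
```

In this sketch, local fine-tuning would minimize the weighted sum of per-segment losses over the retrieved demonstrations, so the update budget concentrates on the segments the current policy handles worst.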
Elastically-Constrained Meta-Learner for Federated Learning
Lan, Peng, Chen, Donglai, Xie, Chong, Chen, Keshu, He, Jinyuan, Zhang, Juntao, Chen, Yonghong, Xu, Yan
Federated learning is an approach that lets multiple parties collaboratively train machine learning models without sharing data. One of the challenges in federated learning is non-IID data across clients, as a single model cannot fit the data distribution of every client. Meta-learning, such as Per-FedAvg, has been introduced to cope with this challenge. Meta-learning learns shared initial parameters for all clients; each client then employs gradient descent to quickly adapt the initialization to its local data distribution, realizing model personalization. However, due to the non-convex loss function and the randomness of sampled updates, meta-learning approaches have unstable adaptation targets for the same client across rounds. This fluctuation between different adaptation directions hinders convergence in meta-learning. To overcome this challenge, we use the historical locally adapted model to restrict the direction of the inner loop and propose an elastically-constrained method. As a result, the inner loop of the current round retains its historical targets while still adapting toward better solutions. Experiments show our method boosts meta-learning convergence and improves personalization without additional computation or communication. Our method achieves state-of-the-art results on all metrics across three public datasets.
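The elastic constraint on the inner loop can be sketched as a proximal term that pulls each inner-loop gradient step toward the historical locally-adapted parameters. The strength `mu`, the learning rate, and the plain-list parameter representation below are illustrative assumptions rather than the paper's exact update rule.

```python
def elastic_inner_step(theta, grad, theta_hist, lr=0.1, mu=0.5):
    """One inner-loop step: gradient descent on the local loss, elastically
    constrained toward the historical adapted parameters theta_hist.
    With mu = 0 this reduces to a plain (Per-FedAvg-style) inner-loop step."""
    return [t - lr * (g + mu * (t - h))
            for t, g, h in zip(theta, grad, theta_hist)]
```

Because the penalty gradient `mu * (t - h)` always points back toward last round's adapted solution, successive rounds cannot swing toward wildly different local optima, which is the stabilizing effect the abstract describes.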
Can Fair Federated Learning reduce the need for Personalisation?
Iacob, Alex, Gusmão, Pedro P. B., Lane, Nicholas D.
Federated Learning (FL) enables training ML models on edge clients without sharing data. However, the federated model's performance on local data varies, disincentivising the participation of clients who benefit little from FL. Fair FL reduces accuracy disparity by focusing on clients with higher losses, while personalisation locally fine-tunes the model. Personalisation provides a participation incentive when an FL model underperforms relative to one trained locally: where the federated model yields lower accuracy than a model a client trains entirely on its own data, personalisation raises the accuracy of the pre-trained federated weights to match or exceed that of the local model. This paper evaluates two Fair FL (FFL) algorithms as starting points for personalisation. Our results show that FFL provides no benefit to relative performance in a language task and may double the number of underperforming clients in an image task. Instead, we propose Personalisation-aware Federated Learning (PaFL), a paradigm that pre-emptively uses personalisation losses during training. Our technique achieves a 50% reduction in the number of underperforming clients for the language task while lowering, rather than doubling, the number of underperforming clients in the image task. The evidence thus indicates that PaFL may allow a broader set of devices to benefit from FL, and it represents a promising avenue for future experimentation and theoretical analysis.
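The idea of pre-emptively using personalisation losses during training can be illustrated with a toy objective: each client's contribution blends its federated loss with the loss of a locally fine-tuned copy of the global weights. The scalar quadratic client losses, the blending weight `alpha`, and the fixed number of fine-tuning steps are all assumptions for illustration, not the PaFL formulation itself.

```python
def pafl_objective(w_global, client_targets, alpha=0.5, steps=3, lr=0.25):
    """Toy PaFL-style objective. Each client has quadratic loss (w - t)^2;
    its contribution blends the federated loss at w_global with the loss
    after a few local personalisation (fine-tuning) steps, so training
    anticipates how well clients will do after personalisation."""
    total = 0.0
    for t in client_targets:
        fed_loss = (w_global - t) ** 2
        w_local = w_global
        for _ in range(steps):             # local personalisation steps
            w_local -= lr * 2 * (w_local - t)
        pers_loss = (w_local - t) ** 2
        total += (1 - alpha) * fed_loss + alpha * pers_loss
    return total / len(client_targets)
```

With `alpha = 0` this collapses to ordinary federated training; increasing `alpha` shifts the objective toward post-personalisation performance, which is what lets underperforming clients benefit.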
Salvaging Federated Learning by Local Adaptation
Yu, Tao, Bagdasaryan, Eugene, Shmatikov, Vitaly
Federated learning (FL) is a heavily promoted approach for training ML models on sensitive data, e.g., text typed by users on their smartphones. FL is expressly designed for training on data that are unbalanced and non-iid across the participants. To ensure privacy and integrity of the federated model, the latest FL approaches use differential privacy or robust aggregation to limit the influence of "outlier" participants. First, we show that on standard tasks such as next-word prediction, many participants gain no benefit from FL because the federated model is less accurate on their data than the models they can train locally on their own. Second, we show that differential privacy and robust aggregation make this problem worse by further degrading the accuracy of the federated model for many participants. We then evaluate three techniques for local adaptation of federated models: fine-tuning, multi-task learning, and knowledge distillation. We analyze where each technique is applicable and demonstrate that all participants benefit from local adaptation. Participants whose local models are poor obtain large accuracy improvements over conventional FL. Participants whose local models are better than the federated model, and who therefore have no incentive to participate in FL today, improve less, but sufficiently to make the adapted federated model better than their local models.
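Of the three adaptation techniques, knowledge distillation admits the most compact sketch: the adapted (student) model fits the client's local labels while staying close to the federated (teacher) model's soft outputs. The blending weight `alpha` and the absence of a softmax temperature are simplifications for illustration, not the paper's exact loss.

```python
import math

def distill_loss(student_probs, teacher_probs, label_onehot, alpha=0.5):
    """Local-adaptation loss: cross-entropy on the client's local label,
    blended with cross-entropy against the federated model's soft outputs
    so the adapted model keeps the federated model's knowledge."""
    ce = -sum(y * math.log(p) for y, p in zip(label_onehot, student_probs))
    kd = -sum(t * math.log(p) for t, p in zip(teacher_probs, student_probs))
    return (1 - alpha) * ce + alpha * kd
```

Setting `alpha = 0` recovers plain local fine-tuning on hard labels, while larger `alpha` keeps the adapted model anchored to the federated model's predictions.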
Episodic Memory in Lifelong Language Learning
d'Autume, Cyprien de Masson, Ruder, Sebastian, Kong, Lingpeng, Yogatama, Dani
We introduce a lifelong language learning setup where a model needs to learn from a stream of text examples without any dataset identifier. We propose an episodic memory model that performs sparse experience replay and local adaptation to mitigate catastrophic forgetting in this setup. Experiments on text classification and question answering demonstrate the complementary benefits of sparse experience replay and local adaptation in allowing the model to continuously learn from new datasets. We also show that the space complexity of the episodic memory module can be reduced significantly (by 50-90%) by randomly choosing which examples to store in memory, with only a minimal decrease in performance. We consider an episodic memory component a crucial building block of general linguistic intelligence and see our model as a first step in that direction.
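A write policy that randomly chooses which stream examples to keep in a bounded buffer can be sketched with reservoir sampling, which maintains a uniform random subset of everything seen so far without needing dataset identifiers. The paper's exact storage rule may differ, so treat this as an illustrative instantiation.

```python
import random

def reservoir_write(memory, example, index, capacity):
    """Decide whether the index-th stream example (1-indexed) is stored.
    Every example seen so far remains in memory with equal probability
    capacity / index, so memory stays bounded at `capacity` entries."""
    if len(memory) < capacity:
        memory.append(example)
    else:
        j = random.randrange(index)
        if j < capacity:
            memory[j] = example
```

Shrinking `capacity` to 10-50% of the stream length is the kind of reduction the abstract reports, with replay and local adaptation then drawing from this smaller buffer.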
Memory-based Parameter Adaptation
Sprechmann, Pablo, Jayakumar, Siddhant M., Rae, Jack W., Pritzel, Alexander, Badia, Adrià Puigdomènech, Uria, Benigno, Vinyals, Oriol, Hassabis, Demis, Pascanu, Razvan, Blundell, Charles
Deep neural networks have excelled on a wide range of problems, from vision to language and game playing. Neural networks incorporate information into their weights only very gradually as they process data, requiring very low learning rates. If the training distribution shifts, the network is slow to adapt, and when it does adapt, it typically performs badly on the training distribution from before the shift. Our method, Memory-based Parameter Adaptation, stores examples in memory and then uses a context-based lookup to directly modify the weights of a neural network. Much higher learning rates can be used for this local adaptation, obviating the need for many iterations over similar data before good predictions can be made. Because it is memory-based, our method alleviates several shortcomings of neural networks: it mitigates catastrophic forgetting, enables fast, stable acquisition of new knowledge, copes with imbalanced class labels, and supports fast learning during evaluation. We demonstrate this on a range of supervised tasks: large-scale image classification and language modelling.
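The context-based lookup can be sketched as nearest-neighbour retrieval over stored memory keys, with inverse-squared-distance kernel weights that then scale each retrieved example's contribution to the local parameter update. The kernel form and the `eps` smoothing constant follow a common choice for such memory modules and are assumptions here, not a verbatim reproduction of the paper.

```python
def lookup_weights(query, keys, eps=1e-3):
    """Normalised inverse-squared-distance kernel weights over retrieved
    memory keys: closer neighbours of the query context get more influence
    on the locally adapted weights."""
    dists = [sum((q - k) ** 2 for q, k in zip(query, key)) for key in keys]
    raw = [1.0 / (eps + d) for d in dists]
    z = sum(raw)
    return [w / z for w in raw]
```

Local adaptation would then take a few high-learning-rate gradient steps on the weighted losses of the retrieved examples before predicting, discarding the temporary weights afterwards.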